ABSTRACT

The hypothesis that conventional approaches to evaluating
contaminants in performance appraisal overlook important
individual ratee effects was examined. A rating form was developed
that consisted of the following dimensions and behaviors: warmth;
guided discourse or indirect teaching methods; control of subject
matter; enthusiasm and reinforcing; organizing and managing;
presenting and explaining; evaluating; and advising and counseling.
Administration of the form to evaluate 23 instructors resulted in
approximately 1,500 observations per semester. The reliability of the
form and its factor stability were assessed, and possible
contaminants were checked to assure that the evaluations were more
likely to result from the instructor's performance than from student
or course factors. It was found that 8.6 percent of the instructor
ratees had persistent and significant contaminants associated with
their evaluations; a looser definition of "persistent" pushes the
figure to 34.7 percent. It is suggested that the evaluations may not
be assessing performance accurately because of ratee contaminants,
including expected grade in the course, the time at which the course
begins, the time and effort required of the student, and the
student's major. These contaminants occurred in spite of the fact
that the instrument was found to have face validity, factor
stability, and internal consistency. It is proposed that adjustments
could be made on an individual basis and only for those contaminants
that are persistent for each instructor. However, what is needed is a
practical decision rule that would permit users of such evaluations
to make necessary adjustments in the appraisals to correct for such
persistent effects. Interactions among contaminants should also be
addressed. (SW)
PERSISTENT RATEE CONTAMINANTS IN PERFORMANCE APPRAISAL

Abstract

This paper examines the hypothesis that conventional approaches
to evaluating contaminants in performance appraisal overlook important
individual ratee effects. These effects are "important" in that they
may differ from those identified for the total set of ratees and in
that they persist over time. A form was developed and applied to 23
instructors in each of four semesters, resulting in at least 1500
observations per semester. The existence of persistent ratee
contaminants is demonstrated. Further, the contaminant set identified
for the individual ratees is not identical to that identified for the
total ratee set. Implications are also briefly discussed.






PERSISTENT RATEE CONTAMINANTS IN PERFORMANCE APPRAISAL

Good or effective teaching is universally agreed upon as important
and desirable. Yet, despite an increasingly large body of literature on
the subject (see Selected Bibliography for examples), little is known
precisely about what a good or effective teacher is. Efforts to isolate
the essential differences between good and poor teachers are numerous,
especially at the elementary and secondary school levels, but, while
progress has been made, much remains to be done. It is not yet very
clear, for instance, which portions of a teacher's behavior are
essential for learning and which are essential for student satisfaction
with the learning process.

Colleges and universities have developed and/or are developing
performance appraisal instruments to provide some student involvement
in the evaluation of teaching. These instruments are being used
increasingly by administrators as at least one source of information
upon which to make personnel decisions. Departmental or subject matter
specific instruments are rarely developed despite the obvious problems
of aggregation and applicability which result from using a
university-wide instrument. But administrative actions based on even
carefully developed instruments may be in error due to the existence of
persistent ratee contaminants. The purposes of this paper are first to
demonstrate the existence of such persistent effects and then to
briefly discuss their implications.



The Form 



The particular form used in this research was developed as a
departmental or subject matter specific form by a faculty committee
consisting of a lawyer, an expert in performance evaluation in both the
private and public sectors, and a faculty member familiar with basic
research in teaching evaluation. This committee examined the literature
and several existing instruments and decided to strive for relevancy in
the items used. Therefore, items which the literature suggested as
being non-relevant were immediately dropped from consideration: dress,
hair length, sex, and the like. A tentative list of items covering
"dimensions" and "behaviors" identified in the literature was
developed. The "dimensions" and "behaviors" were: (1) warmth; (2)
guided discourse or indirect method of teaching; (3) control or grasp
of subject matter; (4) enthusiasm, motivating and reinforcing; (5)
organizing, coordinating and managing; (6) presenting, explaining and
demonstrating; (7) evaluating; and (8) advising and counseling. That
tentative list was submitted to the departmental faculty and a form
finalized following faculty review.

The form was used in all sections of all courses taught by the
department each regular semester for two years. In each semester, 1500
or more observations were obtained (student forms, not separate
students, as a single student may have had more than one instructor).
A total of 23 instructors were evaluated in all four semesters; to
assure comparability, only these 23 were used in the analysis although
more than 30 taught in each of the semesters.

A split-half reliability coefficient (Spearman-Brown) was calculated
for each semester. The results were: 0.83 (N=2007); 0.84 (N=1627); 0.84
(N=1798); and 0.84 (N=1499). Factor stability was also found to exist,
as can be readily seen in Table 1. These data suggest that the internal
consistency and stability of the form were acceptable or even excellent.
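
The split-half calculation can be illustrated with a minimal sketch. It assumes an odd/even split of the items, which the paper does not specify, and uses made-up forms; the function names are hypothetical. The Spearman-Brown step is the standard correction r_SB = 2r / (1 + r), where r is the correlation between the two half-form totals.

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def split_half_reliability(forms):
    """Spearman-Brown corrected split-half reliability.

    `forms` is a list of completed rating forms, each a list of item scores.
    The odd/even item split is an assumption made for this sketch.
    """
    odd_totals = [sum(f[0::2]) for f in forms]
    even_totals = [sum(f[1::2]) for f in forms]
    r_half = pearson_r(odd_totals, even_totals)
    return 2 * r_half / (1 + r_half)

# Hypothetical data: five forms, six items each, scored 1-5.
forms = [
    [5, 4, 5, 5, 4, 5],
    [3, 3, 2, 3, 3, 2],
    [4, 4, 4, 3, 4, 4],
    [2, 1, 2, 2, 1, 2],
    [5, 5, 4, 5, 5, 5],
]
reliability = split_half_reliability(forms)
```

Under this correction, a reported coefficient of 0.84 corresponds to a correlation of roughly 0.72 between the two half-form totals.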

To check for possible contamination, several items of student
background and perceptions about the course were obtained and were correlated
TABLE 1

FOUR SEMESTER FACTOR ANALYSIS
(Items Grouped by Highest Loading on Rotated Factor Matrix)
(Principal Components Method; Varimax Rotation)

Factors: Openness; Interest in Students; Testing; Mechanics; View of
Material

Main Point of Item (S refers to students and I to instructor)

    S's feel free to ask questions
    Instructor asked challenging questions
    Instructor open to other views
    Instructor used Socratic method
    Instructor provided guidance
    I met with S outside of class
    I personally interested in S
    examination feedback useful
    exam questions were clear
    I fair in grading examinations
    exams were a...
    course requirements and grading clear
    Instructor organized
    course objectives clear
    Instructor was well prepared
    I stressed important material
    I accomplished objectives of course
    I aroused S interest in material
    concepts clearly presented
    Instructor knew material
    I displayed interest in material
    I used examples to clarify material
    I enthusiastic about material
    I presented information not in text

NOTE: Using ..., five factors were obtained for three semesters and four
... semester, the "Mechanics" factor was not ...
with the overall ratings which were given to the instructors. The data
presented in Table 2 demonstrate that while several highly significant
contamination effects are present, they are quite small and, indeed,
singly would account for extremely small proportions of variance in the
overall ratings. These data, too, suggest that the form used was
acceptable or even better.
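
The claim about small proportions of variance follows from squaring each correlation. The sketch below, offered only as an illustration, squares three of the larger coefficients reported in Table 2, including the largest (0.2304 for expected grade); the dictionary and function names are hypothetical.

```python
# Selected coefficients from Table 2 (largest magnitude in each listed row).
selected_r = {
    "expected grade in course": 0.2304,
    "difficulty of course": -0.0918,
    "time and effort required": -0.1439,
}

def variance_explained(r):
    """Proportion of variance in the overall rating associated with r."""
    return r * r

# Percent of variance, rounded to one decimal place.
shares = {
    name: round(100 * variance_explained(r), 1)
    for name, r in selected_r.items()
}
```

Even the largest single coefficient is associated with only about 5 percent of the variance in the overall ratings; the others are associated with about 2 percent or less.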

Ratee Contaminants

Thus far what has been done is rather conventional (with the exception
of four replications of factor analysis) for the evaluation of teaching.
A faculty committee was used to assure face validity and acceptance of
the form developed. The reliability of the form and its factor stability
were assessed to assure the internal consistency and stability of the
form. Possible contaminants were checked to assure that the evaluations
received were more likely to be the result of the performance of the
instructor than of characteristics of the students or the course. In
this instance, all indicators were that the form was acceptable but that
slight contaminants appeared to exist: the grade which the student
expects to receive in the course, the sex of the instructor, and the
time of day at which the course begins.

But what if one or more of the instructors performed in such a way as
to evoke a persistent pattern of response from students with regard to
the evaluation, a pattern reflecting not performance but some
characteristic of the student? What if an instructor constantly made
sexist remarks so that females underrated the instructor while males
were mixed in their reactions? Such ratee specific contaminants might go
undetected using a conventional approach to evaluate the form such as
outlined above. The usual assumption seems to be that such effects might
exist but would not persist. Therefore,



TABLE 2
CONTAMINATION ANALYSIS

                                             Semester
Background Factor               1            2            3            4

Major                        +0.0050      -0.0980      -0.0548      -0.0032
Classification               +0.0290      -0.1176***   -0.0253      +0.0191
Time Course Begins           +0.0631*     +0.0008      +0.0745*     +0.1362***
Reason for Selecting
  Course                     -0.0049      +0.0045      -0.0482      -0.0308
Career Plans                 +0.0154      -0.0409      -0.0089      -0.0346
Sex                          +0.0294      +0.0227      +0.0612      +0.0290
Sex of Instructor            +0.1057***   +0.1150***   +0.1160***   +0.1[?]16***
Match between Student
  Sex and Instructor Sex     +0.0686**    +0.0903***   +0.1148***   +0.0397
Overall Grade Point
  Average                    -0.0294      +0.0778**    +0.0636**    -0.0281
Expected Grade in Course     +0.1363***   +0.1905***   +0.2304***   +0.1248***
Difficulty of Course         -0.0395      -0.0918***   -0.0875***   +0.0266
Time and Effort Required
  in Course                  -0.0107      -0.1035***   -0.1439***   +0.0244

  * p ≤ 0.01
 ** p ≤ 0.001
*** p ≤ 0.0001

Sex was coded 1 for males and 2 for females.
an examination of those effects by instructor (N=23) for each semester
(N=4) was performed. The results are summarized in Table 3.

As can be seen from the data in Table 3, 8.6 percent of the instructor
ratees have persistent contaminants associated with their evaluations,
where persistent is defined to mean significant (p ≤ 0.10) in all four
semesters. If a slightly looser definition of persistent is used
(significant in three of the four semesters), this increases to 34.7
percent, over a third of the instructors! This means, of course, that
for those ratees, the evaluations may not be assessing performance
accurately.
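
The two definitions of "persistent" can be written as a small decision function. This is a sketch with hypothetical names and p-values, not the procedure used in the study. The reported percentages are consistent with 2 and 8 of the 23 instructors, truncated rather than rounded; the count of 2 is an inference here and is not stated in the paper.

```python
ALPHA = 0.10  # significance threshold applied to each semester

def persistence(p_values, alpha=ALPHA):
    """Classify one contaminant for one instructor from per-semester p-values."""
    hits = sum(p <= alpha for p in p_values)
    if hits == len(p_values):
        return "persistent (all semesters)"
    if hits == len(p_values) - 1:
        return "persistent (looser definition)"
    return "not persistent"

def share_of_ratees(count, total=23):
    """Percent of the instructor ratees represented by `count`."""
    return 100 * count / total

# Hypothetical p-values for one contaminant across the four semesters.
label = persistence([0.01, 0.50, 0.09, 0.02])

strict_share = share_of_ratees(2)  # about 8.70 percent
loose_share = share_of_ratees(8)   # about 34.78 percent
```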

One of the twelve possible contaminants, expected grade in course, is
involved for six of the eight faculty, and the correlation is as high as
0.59 for one of the instructors in one semester. The other persistent
contaminants are the time at which the course begins, the time and
effort required of the student in the course, and the student's major.
These latter two were not identified as persistent when the total data
set was analyzed. Further, the sex of the instructor, which had been
identified as a persistent contaminant for the total data set, does not
appear as one on an instructor-by-instructor basis.

Conclusions

These data clearly demonstrate the existence of persistent ratee
contaminants even for an instrument which is acceptable to those for
whom it is designed to be applied and which appears to be reasonably
good from a psychometric standpoint. This means that, even if
conventional approaches are used to assure good evaluations are being
conducted, some ratees may be receiving improper evaluations, higher or
lower than their performance alone would suggest.

What can be done? First, even if the total data set is used in a



TABLE 3

RATEE CONTAMINATION ANALYSIS

Ratee



1 * „ t J ' * * * k f,h,j 

2 ^ a jb,i 



3 %, ©®- Qfi b^f.h^) a(c),i,k 

A % f,h,j e,f,h,j ' b,c,i,k v k 

6 \ r * - a.f.h.l c i 

J \ b <© & k (ski j 

8 , c.f.h.iAk.i c7h,i e ^j) 

10 - *a,b,f,h,i a,b,d c,j d 

11 • j ' a,f,h ^ ... d,i • 

12 b,c ' f,h . t I a,c,d,h,j 

13 • - * b,J,k " 



15 a,i b c,I d,f,h,k,l 

16 *. . ,f,h(3),k,i ' f,hQ) cAl " • (kk 

17 0,c,d,iq),k @, f ,Y gb.d.Yihai <3) , 

18 • b,c,k,l h.,j a,b,c,a,i,k-,l 

19 4 J b,c,f,h,j b,c,e,f,h,l 
20. a ' j k ■ - " d 

21 c,i,j,l b,c,j . " 1 

22 - e @ a,c@,k . -a,b,c,d"@ 

23 f a,d,i« . f,h,j-,k 

NOTE: Only those contaminants significant at the 0.10 level are shown.

a = major
b = classification
c = time course begins
d = reason for selecting course
e = career plans
f = sex of student (1 = male; 2 = female)
g = sex of instructor (1 = male; 2 = female)
h = match between student sex and instructor sex
i = overall grade point average
j = expected grade in course
k = difficulty of course
l = time and effort required in course

A circled letter denotes a persistent contaminant for the particular
ratee/instructor.
regression model to adjust for possible contaminants, proper results may
not be obtained. As indicated by these data, contaminants which appear
significant for the total data set may not be persistent ones for ratees
and some which are persistent for ratees may not show up for the total
data set. Second, the contaminants which emerge with each administration
of an instrument are likely both to involve more ratees than those for
whom persistent contaminants exist and to involve more possible
contaminants than actually will persist. This would seem to suggest that
adjustments would have to be made on an individual basis and only for
those contaminants which are persistent for each individual. This is
clearly a monumental task and one which is not likely to be undertaken
by many organizations.
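
One way such an individual adjustment might look is sketched below. The paper does not specify an adjustment method, so the approach is an assumption: a simple least-squares line is fitted for one instructor and one persistent contaminant, and each rating is restated at the mean level of that contaminant. The function name and the data are hypothetical.

```python
from statistics import mean

def adjust_ratings(ratings, contaminant):
    """Remove the linear effect of one contaminant from one instructor's ratings.

    Fits rating = a + slope * contaminant by least squares, then restates
    each rating at the mean contaminant level (assumed method).
    """
    mx, my = mean(contaminant), mean(ratings)
    sxx = sum((x - mx) ** 2 for x in contaminant)
    if sxx == 0:
        return list(ratings)  # contaminant does not vary; nothing to adjust
    sxy = sum((x - mx) * (y - my) for x, y in zip(contaminant, ratings))
    slope = sxy / sxx
    return [y - slope * (x - mx) for x, y in zip(contaminant, ratings)]

# Hypothetical forms for one instructor: expected grade (4 = A) and rating.
expected_grade = [4, 4, 3, 3, 2, 2]
overall_rating = [5, 4, 4, 3, 3, 2]
adjusted = adjust_ratings(overall_rating, expected_grade)
```

The adjusted ratings keep the instructor's mean rating but no longer vary linearly with the contaminant. Repeating this for every instructor and every persistent contaminant is the kind of effort the text describes as monumental.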

What is needed is a practical decision rule which will permit users
of such evaluations to make necessary adjustments in the appraisals to
correct for such persistent effects without all of this effort, if
possible. However, before that can be done the question of interactions
among these contaminants must also be addressed. It is highly likely
that "reason for choosing the course," "career plans," "classification,"
and "major" will display some interaction which may heighten or lessen
the problem identified here. The next step in the research project being
reported here is to examine those interactions and to move toward the
establishment of a decision rule for treating the problem.



Selected Bibliography

Brandenburg, D. and Remmers, P., "Rating scales for instructors,"
    Educational Administration and Supervision, Vol. 13 (1927),
    [pages illegible].

Breed, F. S., "Factors contributing to success in college teaching,"
    Journal of Educational Research, Vol. 16 (1927), 247-253.

Campbell, D. T. and Fiske, D. W., "Convergent and discriminant
    validation by the multi-trait, multi-method matrix," Psychological
    Bulletin, Vol. 56 (1959), 81-105.

Carrier, N. A., et al., "Course evaluation: when?" Journal of
    Educational Psychology, Vol. 66 (1974), 609-613.

Champlin, C., "The preferred college professor," School and Society,
    Vol. 27 (1928), 25-37.

Charters, W. W., "The improvement of college teaching," School and
    Society, Vol. 13 (1921), 494-497.

Coats, W. D., et al., "Student perceptions of teachers: a factor
    analytic study," Journal of Educational Research, Vol. 65 (1972),
    357-360.

Davenport, M. M., "Methods for evaluating good teaching," Journal of
    Animal Science, Vol. 24 (1965), 1209-1214.

Davis, C. O., "Our best teachers," School Review, Vol. 34 (1926),
    754-759.

Elbe, K. E., "Improving college teaching," Phi Delta Kappan, Vol.
    [illegible] (1971), 283-285.

Frey, P. W., "Student ratings of teaching: validity of several rating
    factors," Science, Vol. 182 (1973), 81-85.

Gage, N. L. (ed.), Handbook of Research on Teaching (Chicago: Rand
    McNally, 1963).

Isaacson, R. L., et al., "Dimensions of student evaluations of
    teaching," Journal of Educational Psychology, Vol. 55 (1964),
    344-351.

Jenkins, J. R. and Deno, S. L., "Influence of student behavior on
    teacher's self-evaluation," Journal of Educational Psychology,
    Vol. 60 (1969), 439-442.

Kent, L., "Student evaluation of teaching," The Educational Record,
    Vol. 47 (1966), 376-406.

Leventhal, D. F., et al., "Student evaluations of teacher behaviors as
    estimations of real-ideal discrepancies: a critique of teacher
    rating methods," Journal of Educational Psychology, Vol. 62
    [year and pages illegible].

Marks, E., "Individual differences in perceptions of the college
    environment," Journal of Educational Psychology, Vol. 61 (1970),
    270-279.

Maslow, A. H. and Zimmerman, W., "College teaching ability, scholarly
    activity, and personality," Journal of Educational Psychology,
    Vol. [remainder illegible].

McFillen, J. M., "An analysis of factor congruency and subscale
    reliability for a course evaluation questionnaire," Proceedings of
    the American Institute for Decision Sciences, 1976.

Nash, H. B. and Bush, F. R., "To what extent do grades influence
    student ratings of instructors?" Journal of Educational Research,
    Vol. 21 (1930), 314-316.

Nunnally, J. C., et al., "Factored scales for measuring characteristics
    of college environments," Educational and Psychological
    Measurement, Vol. 23 (1963), 239-248.

Pambookian, H. S., "Initial level of student evaluation of instruction
    as a source of influence on instructor change after feedback,"
    Journal of Educational Psychology, Vol. 66 (1974), 52-55.

Perry, R. P., et al., "Effect of prior teaching evaluations and lecture
    presentation on ratings of teaching performance," Journal of
    Educational Psychology, Vol. 66 (1974), 851-856.

Pressey, S. L., et al., "Research adventures in university teaching,"
    School and Society, Vol. 20 (1924), 635-63[?].

Rees, R. D., "Dimensions of students' points of view in rating college
    teachers," Journal of Educational Psychology, Vol. 60 (1969),
    476-482.

Rodin, M. and Rodin, B., "Student evaluations of teachers," Science,
    Vol. 177 (1972), 1164-1166.

Spencer, R. E. and Aleamoni, L. W., "A student course evaluation
    questionnaire," Journal of Educational Measurement, Vol. 7 (1970),
    209-[illegible].

Sullivan, A. M. and Skanes, G. R., "Validity of student evaluation of
    teaching and the characteristics of successful instructors,"
    Journal of Educational Psychology, Vol. 66 (1974), 584-590.

Van Fleet, D. D., "Salary administration in higher education: a
    tentative plan," AAUP Bulletin, Vol. 58 (1972), 413-418.

Van Fleet, E., "Q-sort in teacher evaluation," The Delta Pi Epsilon
    Journal, Vol. 14 (1972), 1-17.

Veldman, D. J. and Peck, R. F., "Influences on pupil evaluations of
    student teachers," Journal of Educational Psychology, Vol. 60
    (1969), 103-108.

Woodring, P., The higher learning in America: a reassessment (New York:
    McGraw-Hill, 1968).






